Feature-Rich Translation by Quasi-Synchronous Lattice Parsing
Abstract
We present a machine translation framework that can incorporate arbitrary features of both input and output sentences. The core of the approach is a novel decoder based on lattice parsing with quasi-synchronous grammar (Smith and Eisner, 2006), a syntactic formalism that does not require source and target trees to be isomorphic. Using generic approximate dynamic programming techniques, this decoder can handle “non-local” features. Similar approximate inference techniques support efficient parameter estimation with hidden variables. We use the decoder to conduct controlled experiments on a German-to-English translation task, to compare lexical phrase, syntax, and combined models, and to measure the effects of various restrictions on non-isomorphism.
Similar resources
Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text u...
Discriminative Feature-Rich Modeling for Syntax-Based Machine Translation
State-of-the-art statistical machine translation systems are most frequently built on phrase-based (Koehn et al., 2003) or hierarchical translation models (Chiang, 2005). In addition, a wide variety of models exploiting syntactic annotation on either the source or target side (or both) have recently been developed and also give state-of-the-art performance (Galley et al., 2006; Zollmann and Venu...
Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features
Recent research has shown clear improvement in translation quality by exploiting linguistic syntax for either the source or target language. However, when using syntax for both languages (“tree-to-tree” translation), there is evidence that syntactic divergence can hamper the extraction of useful rules (Ding and Palmer 2005). Smith and Eisner (2006) introduced quasi-synchronous grammar, a formal...
Quasi-Synchronous Dependence Model for Information Retrieval
Incorporating syntactic features in a retrieval model has had very limited success in the past, with the exception of term dependencies. This paper presents a new term dependency modeling approach based on a dependency parsing technique used for both queries and documents. Our model is inspired by a quasi-synchronous stochastic process for machine translation [21]. It describes four different t...
Joshua 2.0: A Toolkit for Parsing-Based Machine Translation with Syntax, Semirings, Discriminative Training and Other Goodies
We describe the progress we have made in the past year on Joshua (Li et al., 2009a), an open source toolkit for parsing-based machine translation. The new functionality includes: support for translation grammars with a rich set of syntactic nonterminals, the ability for external modules to posit constraints on how spans in the input sentence should be translated, lattice parsing for dealing wit...